This markdown document provides a condensed overview of methods for doing spatial data analysis and map-making in R for the Battles Lab. I’m aiming for a broad-but-shallow introduction, with the idea that you can pursue individual topics more deeply if they’re relevant to your work. This document borrows heavily from the excellent free ebook Geocomputation with R by Robin Lovelace. I’ll editorialize with my own experience and add additional resources that have been helpful to me.
Pros:
Cons:
terra and stars), but still requires being thoughtful about where and how the computer is storing and processing data, in situations where you could just blindly plug and play in ArcGIS.

Go-to packages:
library(here) # for easy filepathing
## here() starts at C:/Users/dfoster/Documents/spatial_intro
library(raster) # for raster data
## Loading required package: sp
library(sf) # for vector data
## Warning: replacing previous import 'lifecycle::last_warnings' by
## 'rlang::last_warnings' when loading 'pillar'
## Warning: replacing previous import 'lifecycle::last_warnings' by
## 'rlang::last_warnings' when loading 'tibble'
## Linking to GEOS 3.9.1, GDAL 3.2.1, PROJ 7.2.1
library(ggplot2) # for mapmaking
library(tidyverse) # for data munging, including of sf objects
## Warning: replacing previous import 'lifecycle::last_warnings' by
## 'rlang::last_warnings' when loading 'hms'
## -- Attaching packages --------------------------------------- tidyverse 1.3.1 --
## v tibble 3.1.4 v dplyr 1.0.7
## v tidyr 1.1.3 v stringr 1.4.0
## v readr 2.0.1 v forcats 0.5.1
## v purrr 0.3.4
## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x tidyr::extract() masks raster::extract()
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
## x dplyr::select() masks raster::select()
Other useful packages:
library(terra) # next-gen version of raster/sf, with a less developed guidance/support
## Warning: package 'terra' was built under R version 4.1.2
## terra version 1.4.22
##
## Attaching package: 'terra'
## The following object is masked from 'package:dplyr':
##
## src
## The following object is masked from 'package:purrr':
##
## simplify
## The following object is masked from 'package:tidyr':
##
## extract
## The following objects are masked from 'package:raster':
##
## adjacent, animate, area, boundaries, buffer, cellFromRowCol,
## cellFromRowColCombine, cellFromXY, clamp, click, colFromCell,
## colFromX, cover, crop, crosstab, crs, crs<-, distance, erase,
## extend, extract, flip, focal, freq, geom, hasValues, init,
## inMemory, interpolate, mask, modal, mosaic, ncell, ncol<-, nrow<-,
## origin, origin<-, plotRGB, rasterize, readStart, readStop, rectify,
## res, res<-, resample, RGB, rotate, rowColFromCell, rowFromCell,
## rowFromY, setMinMax, setValues, shift, stretch, symdif, terrain,
## trim, values, values<-, writeRaster, writeStart, writeStop,
## writeValues, xFromCell, xFromCol, xmax, xmax<-, xmin, xmin<-, xres,
## xyFromCell, yFromCell, yFromRow, ymax, ymax<-, ymin, ymin<-, yres,
## zonal, zoom
# ecosystem but better abilities to handle big data
# https://rspatial.org/terra/pkg/index.html
library(stars) # next-gen version of raster, IMO not as developed as terra
## Loading required package: abind
# https://r-spatial.github.io/stars/
library(tmap) # another mapping package, particularly useful for data exploration
# https://cran.r-project.org/web/packages/tmap/vignettes/tmap-getstarted.html
I highly recommend these resources for doing spatial analysis in R. I’ll flag other relevant resources and packages in specific sections:
There are two main types of spatial data: vector data (points, lines, and polygons) and raster data (regular grids of cells).
# load DX area of interest from a shapefile
dx_aoi =
sf::st_read(here::here('data', 'small', 'dxstudyarea.shp'))
## Reading layer `dxstudyarea' from data source
## `C:\Users\dfoster\Documents\spatial_intro\data\small\dxstudyarea.shp'
## using driver `ESRI Shapefile'
## Simple feature collection with 1 feature and 6 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: 335850.6 ymin: 4045386 xmax: 342623.8 ymax: 4051330
## Projected CRS: NAD83 / UTM zone 11N
The metadata tells us that there’s only a single feature (point, line, or polygon) with six “fields” (columns of data). You can print the metadata any time by calling the sf object:
# print metadata
dx_aoi
## Simple feature collection with 1 feature and 6 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: 335850.6 ymin: 4045386 xmax: 342623.8 ymax: 4051330
## Projected CRS: NAD83 / UTM zone 11N
## OBJECTID Id gridcode Shape_Leng Shape_Area area_ha
## 1 1 1 1827 60099.38 17046296 1704.63
## geometry
## 1 MULTIPOLYGON (((336799.6 40...
Usefully, this also tells us the coordinate reference system (CRS) of the data.
We can do a basic plot, which gives one panel per field:
plot(dx_aoi)
This isn’t that helpful, though. The tmap library is great for providing ArcMap-style interactive maps using a ggplot-style syntax:
tmap::tmap_mode('view') # sets the viewer to interactive; the alternative "plot"
## tmap mode set to interactive viewing
# used for standard figure-style maps
tmap::tm_shape(dx_aoi)+ # tell tmap what data to use
tmap::tm_borders(col = 'blue')+ # draw just polygon borders with a blue color
  # we could also have set col = colname, where colname is the name of a field in
  # the shapefile, to color by some variable. alternatives are tm_polygons() for
  # filled polygons, tm_dots() for point data, and tm_lines() for linear features
tmap::tm_scale_bar() # add a scale bar
Now we have an interactive map with some nice baselayers for context. We will add additional layers later.
Often vector data will be stored in a geodatabase, which we can also read using sf:
# get the names of all the layers in the geodatabase
sf::st_layers(here::here('data', 'big', 'fire20_1.gdb'))
## Driver: OpenFileGDB
## Available layers:
## layer_name geometry_type features fields
## 1 firep20_1 Multi Polygon 21318 17
## 2 rxburn20_1 Multi Polygon 6412 16
## 3 Non_RXFire_Legacy13_2 Multi Polygon 864 16
frap_fires =
# note that you have to specify the layer name if reading from a geodatabase;
# if you don't it will read in the first layer by default.
sf::st_read(here::here('data',
'big',
'fire20_1.gdb'),
layer = 'firep20_1')
## Reading layer `firep20_1' from data source
## `C:\Users\dfoster\Documents\spatial_intro\data\big\fire20_1.gdb'
## using driver `OpenFileGDB'
## Simple feature collection with 21318 features and 17 fields
## Geometry type: GEOMETRY
## Dimension: XY
## Bounding box: xmin: -373237.5 ymin: -604727.6 xmax: 539438.2 ymax: 518283.7
## Projected CRS: NAD83 / California Albers
Most raster data types are handled well by the raster package. You might also use terra or stars, especially for big datasets.
A single raster layer can be loaded using the raster function:
mmi = raster::raster(here::here('data',
'small',
'mmi_sum5_sc408ss_2016.bsq'))
Again we can look at metadata easily:
mmi
## class : RasterLayer
## dimensions : 1777, 4501, 7998277 (nrow, ncol, ncell)
## resolution : 30, 30 (x, y)
## extent : 213855, 348885, 4034985, 4088295 (xmin, xmax, ymin, ymax)
## crs : +proj=utm +zone=11 +datum=WGS84 +units=m +no_defs
## source : mmi_sum5_sc408ss_2016.bsq
## names : mmi_sum5_sc408ss_2016
## values : 0, 255 (min, max)
## attributes :
## ID category
## from: 0 0: No Canopy Loss / No Disturbance
## to : 112 112: Non-Processed Landcover Type / Unavailable
This raster has attributes, meaning that the actual cell data values (integers from 0 to 255) are mapped to categories. In this case, the cell data values map directly to % mortality:
head(levels(mmi)[[1]])
## ID category
## 1 0 0: No Canopy Loss / No Disturbance
## 2 1 1 % loss
## 3 2 2 % loss
## 4 3 3 % loss
## 5 4 4 % loss
## 6 5 5 % loss
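For reference, here's how an attribute table like this gets attached to a raster in the first place (the MMI layer above came with its table already built). A minimal sketch using raster::ratify() on a toy raster:

```r
library(raster)

# toy raster with three integer classes
r = raster(matrix(c(0, 1, 1, 2), nrow = 2))
r = ratify(r)  # flag the raster as categorical; builds a table of the unique IDs

# attach human-readable categories to the cell values
rat = levels(r)[[1]]
rat$category = c('none', 'low', 'high')
levels(r) = rat
levels(r)[[1]]
```

The `rat` data.frame here is the same `ID`/`category` structure you see when printing mmi above.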
And basic plotting is also easy:
plot(mmi)
tmap again offers a nice interactive map:
tmap_mode('view')
## tmap mode set to interactive viewing
tmap::tm_shape(mmi)+
tmap::tm_raster()+
tmap::tm_scale_bar()
## stars object downsampled to 1592 by 628 cells. See tm_shape manual (argument raster.downsample)
## Warning: Number of levels of the variable "NA" is 113, which is
## larger than max.categories (which is 30), so levels are combined. Set
## tmap_options(max.categories = 113) in the layer function to show all levels.
Note that tmap has automatically downsampled the large raster to make it plot in a reasonable amount of time.
NetCDF files are a common file format for raster stacks:
terraclimate = raster::stack(here::here('data',
'big',
'TerraClimate_vpd_2020.nc'))
## Loading required namespace: ncdf4
## Warning in .getCRSfromGridMap4(atts): cannot process these parts of the CRS:
## long_name=crs
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO", prefer_proj =
## prefer_proj): Discarded datum unknown in Proj4 definition
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO", prefer_proj =
## prefer_proj): Discarded datum unknown in Proj4 definition
It didn’t like something about the projection here; let’s take a closer look:
terraclimate
## class : RasterStack
## dimensions : 4320, 8640, 37324800, 12 (nrow, ncol, ncell, nlayers)
## resolution : 0.04166667, 0.04166667 (x, y)
## extent : -180, 180, -90, 90 (xmin, xmax, ymin, ymax)
## crs : +proj=longlat +ellps=WGS84 +no_defs
## names : X2020.01.01, X2020.02.01, X2020.03.01, X2020.04.01, X2020.05.01, X2020.06.01, X2020.07.01, X2020.08.01, X2020.09.01, X2020.10.01, X2020.11.01, X2020.12.01
Seems OK. Note that this raster stack has 12 bands, one for each month.
plot(terraclimate) # plot them all
# plot individual layers
plot(terraclimate[[1]])
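Beyond indexing individual layers, you can summarize across all layers of a stack per pixel with raster::calc(). A self-contained sketch with a toy two-layer stack (the same pattern applies to the 12-month terraclimate stack above):

```r
library(raster)

# a toy two-layer stack standing in for a real monthly stack
r1 = raster(matrix(1:4, nrow = 2))
r2 = raster(matrix(5:8, nrow = 2))
s = stack(r1, r2)

# per-pixel mean across all layers; swap in max, sd, or a custom function
s_mean = calc(s, fun = mean)

# every cell is the mean of the two corresponding input cells
all(values(s_mean) == (values(r1) + values(r2)) / 2)  # TRUE
```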
The basic metadata of either a raster or sf object will tell you the coordinate reference system. You can also extract it specifically:
# just the basics
dx_aoi
## Simple feature collection with 1 feature and 6 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: 335850.6 ymin: 4045386 xmax: 342623.8 ymax: 4051330
## Projected CRS: NAD83 / UTM zone 11N
## OBJECTID Id gridcode Shape_Leng Shape_Area area_ha
## 1 1 1 1827 60099.38 17046296 1704.63
## geometry
## 1 MULTIPOLYGON (((336799.6 40...
# the full version
st_crs(dx_aoi)
## Coordinate Reference System:
## User input: NAD83 / UTM zone 11N
## wkt:
## PROJCRS["NAD83 / UTM zone 11N",
## BASEGEOGCRS["NAD83",
## DATUM["North American Datum 1983",
## ELLIPSOID["GRS 1980",6378137,298.257222101,
## LENGTHUNIT["metre",1]]],
## PRIMEM["Greenwich",0,
## ANGLEUNIT["degree",0.0174532925199433]],
## ID["EPSG",4269]],
## CONVERSION["UTM zone 11N",
## METHOD["Transverse Mercator",
## ID["EPSG",9807]],
## PARAMETER["Latitude of natural origin",0,
## ANGLEUNIT["Degree",0.0174532925199433],
## ID["EPSG",8801]],
## PARAMETER["Longitude of natural origin",-117,
## ANGLEUNIT["Degree",0.0174532925199433],
## ID["EPSG",8802]],
## PARAMETER["Scale factor at natural origin",0.9996,
## SCALEUNIT["unity",1],
## ID["EPSG",8805]],
## PARAMETER["False easting",500000,
## LENGTHUNIT["metre",1],
## ID["EPSG",8806]],
## PARAMETER["False northing",0,
## LENGTHUNIT["metre",1],
## ID["EPSG",8807]]],
## CS[Cartesian,2],
## AXIS["(E)",east,
## ORDER[1],
## LENGTHUNIT["metre",1]],
## AXIS["(N)",north,
## ORDER[2],
## LENGTHUNIT["metre",1]],
## ID["EPSG",26911]]
mmi
## class : RasterLayer
## dimensions : 1777, 4501, 7998277 (nrow, ncol, ncell)
## resolution : 30, 30 (x, y)
## extent : 213855, 348885, 4034985, 4088295 (xmin, xmax, ymin, ymax)
## crs : +proj=utm +zone=11 +datum=WGS84 +units=m +no_defs
## source : mmi_sum5_sc408ss_2016.bsq
## names : mmi_sum5_sc408ss_2016
## values : 0, 255 (min, max)
## attributes :
## ID category
## from: 0 0: No Canopy Loss / No Disturbance
## to : 112 112: Non-Processed Landcover Type / Unavailable
raster::crs(mmi)
## CRS arguments:
## +proj=utm +zone=11 +datum=WGS84 +units=m +no_defs
Note that the raster object has a much simpler representation of the CRS. The raster package uses the old-style proj4string representation, while sf has been updated to the newer, much more explicit WKT version. The switchover has been a big deal in the open-source spatial software community. In my experience, packages have so far been pretty clever about speaking either CRS language and converting when needed (sometimes with warnings). However, things may break at some point in the future, and Google is your friend.
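You can pull individual pieces out of an sf CRS object, which is handy when an older raster-style function wants one specific representation. A small sketch using EPSG:26911 (the NAD83 / UTM zone 11N CRS used throughout this document):

```r
library(sf)

utm11 = st_crs(26911)  # NAD83 / UTM zone 11N
utm11$epsg             # the EPSG code: 26911
utm11$proj4string      # the legacy proj4 string, for raster-style code
cat(substr(utm11$wkt, 1, 40), '...\n')  # the modern, explicit WKT representation
```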
Reprojecting either a raster or a vector object can be very slow if the object is large and/or complex, so often you’ll want to try to crop / downscale / subset the data before trying to reproject it.
# load the DX plot data
dx_plots =
st_read(here::here('data',
'small',
'DX_plots_29orig.shp'))
## Reading layer `DX_plots_29orig' from data source
## `C:\Users\dfoster\Documents\spatial_intro\data\small\DX_plots_29orig.shp'
## using driver `ESRI Shapefile'
## Simple feature collection with 29 features and 18 fields
## Geometry type: POINT
## Dimension: XY
## Bounding box: xmin: 336428.8 ymin: 4046051 xmax: 341506.8 ymax: 4050174
## Projected CRS: WGS 84 / UTM zone 11N
# test equality of the CRS
st_crs(dx_plots) == st_crs(dx_aoi) # false; one is wgs84 and another is NAD83
## [1] FALSE
We can reproject/transform the coordinate reference system, usually so that multiple objects are on the same system and we can have them interact.
# with vector data
# reproject the plots to match the aoi
dx_plots =
dx_plots %>%
sf::st_transform(crs = st_crs(dx_aoi))
# now they're on the same CRS
st_crs(dx_plots) == st_crs(dx_aoi)
## [1] TRUE
# with raster data
vpd_small =
raster::projectRaster(from = terraclimate[[1]], crs = 'EPSG:26911')
## Warning in rgdal::rawTransform(projfrom, projto, nrow(xy), xy[, 1], xy[, : 3
## projected point(s) not finite
## Warning in rgdal::rawTransform(projto_int, projfrom, nrow(xy), xy[, 1], : 460
## projected point(s) not finite
Using an EPSG code is usually the quickest and easiest way to define a projection under both the old and new systems, and the common CRSes are very googleable.
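In practice, sf accepts the EPSG code in a few equivalent forms. A self-contained sketch with a single toy point (arbitrary coordinates chosen to fall in UTM zone 11N):

```r
library(sf)

# a point in WGS84 lon/lat
pt = st_sfc(st_point(c(-118.8, 36.57)), crs = 4326)

# three equivalent ways to target NAD83 / UTM zone 11N
p1 = st_transform(pt, 26911)          # bare EPSG code
p2 = st_transform(pt, 'EPSG:26911')   # EPSG string
p3 = st_transform(pt, st_crs(26911))  # crs object

st_crs(p1) == st_crs(p2)  # TRUE
```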
We can also set the coordinate reference system without reprojecting. You want to be very careful doing this, and usually only do it when, for some reason, the data doesn’t know its CRS but you do:
dx_busted = dx_aoi
st_crs(dx_busted) = NA
dx_busted
## Simple feature collection with 1 feature and 6 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: 335850.6 ymin: 4045386 xmax: 342623.8 ymax: 4051330
## CRS: NA
## OBJECTID Id gridcode Shape_Leng Shape_Area area_ha
## 1 1 1 1827 60099.38 17046296 1704.63
## geometry
## 1 MULTIPOLYGON (((336799.6 40...
st_crs(dx_busted) = st_crs(dx_aoi)
dx_busted
## Simple feature collection with 1 feature and 6 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: 335850.6 ymin: 4045386 xmax: 342623.8 ymax: 4051330
## Projected CRS: NAD83 / UTM zone 11N
## OBJECTID Id gridcode Shape_Leng Shape_Area area_ha
## 1 1 1 1827 60099.38 17046296 1704.63
## geometry
## 1 MULTIPOLYGON (((336799.6 40...
Cropping both rasters and polygons is much faster if you crop by a rectangular extent than by a polygon. If you’re trying to crop a big dataset down to a more manageable size, you almost always want to do a rough first pass using a rectangular extent, and then go back and do your fine clipping later.
# start with a big MMI raster
plot(mmi)
# set values greater than 100 to NA
mmi[mmi>100] = NA
# first do a rough rectangular crop to a buffered AOI; note st_bbox() to get the
# rectangular extent of the polygon
mmi_aoi = raster::crop(mmi,
dx_aoi %>%
st_buffer(100) %>%
st_bbox())
# rough rectangular crop is fast
plot(mmi_aoi)
# now mask the smaller raster by the polygon
mmi_masked =
raster::mask(mmi_aoi, st_buffer(dx_aoi,100))
plot(mmi_masked)
The built-in attribute data for this raster annoyingly maps 0 and NA to the same color, but they are distinct in the underlying data.
plot(mmi_masked==0)
# first, reproject the small AOI polygon to match the CRS of the big fire polygons
# dataset; we do this even though we ultimately want to use the CRS of the aoi
# polygon because it's much faster to reproject the small dataset
frap_fires_clipped =
# start with frap fires
frap_fires %>%
  # was getting a weird error message; googling it led to this Stack Overflow
  # thread https://stackoverflow.com/questions/61404151/error-when-using-st-intersects-in-cpl-geos-binopst-geometryx-st-geometryy
  # which suggests using st_cast()
st_cast('MULTIPOLYGON') %>%
  # rough rectangular crop; note that we first reproject the aoi polygon to
  # match the CRS of the fire polygons (we change the AOI polygon because it's
  # smaller and simpler, and thus faster to reproject)
sf::st_crop(
dx_aoi %>%
# buffer the aoi by 1000 units; the units for the current CRS are meters
st_buffer(1000) %>%
# transform the aoi
st_transform(crs = st_crs(frap_fires))) %>%
# reproject the clipped polygons to match that of the aoi
st_transform(crs = st_crs(dx_aoi)) %>%
# for nicer plotting, make year a numeric
mutate(year = as.numeric(as.character(YEAR_)))
## Warning: attribute variables are assumed to be spatially constant throughout all
## geometries
plot(frap_fires_clipped)
## Warning: plotting the first 9 out of 18 attributes; use max.plot = 18 to plot
## all
head(frap_fires_clipped)
## Simple feature collection with 6 features and 18 fields
## Geometry type: GEOMETRY
## Dimension: XY
## Bounding box: xmin: 334848.3 ymin: 4044168 xmax: 343268.1 ymax: 4052491
## Projected CRS: NAD83 / UTM zone 11N
## YEAR_ STATE AGENCY UNIT_ID FIRE_NAME INC_NUM ALARM_DATE
## 18473 2003 CA NPS KNP GIANT 00000000 2003-07-28 17:00:00
## 18533 1926 CA NPS KNP KAWEAH 00000000 1926-08-09 16:00:00
## 18537 1921 CA NPS KNP ELK CREEK 00000000 1921-07-04 16:00:00
## 18539 1922 CA NPS KNP HOSPITAL 00000000 1922-08-12 16:00:00
## 18578 1947 CA NPS KNP MORO CR 00000000 1947-08-04 16:00:00
## 18587 1926 CA NPS KNP POTWISHA 00000000 1926-07-03 16:00:00
## CONT_DATE CAUSE COMMENTS REPORT_AC GIS_ACRES C_METHOD
## 18473 2003-12-12 16:00:00 1 275 275.4718 1
## 18533 1926-08-18 16:00:00 7 86000 34357.7031 6
## 18537 1921-07-16 16:00:00 4 1600 1551.4668 8
## 18539 1922-08-16 16:00:00 4 550 667.1011 8
## 18578 1947-08-07 16:00:00 8 350 132.1309 8
## 18587 1926-07-09 16:00:00 3 500 881.0719 8
## OBJECTIVE FIRE_NUM Shape_Length Shape_Area
## 18473 2 00000048 5772.193 1114794.7
## 18533 1 00000001 80754.142 139040698.4
## 18537 1 00000001 13140.427 6278563.4
## 18539 1 00000003 8521.353 2699662.4
## 18578 1 00007017 4021.935 534714.9
## 18587 1 00000010 9902.657 3565571.5
## Shape year
## 18473 POLYGON ((342313.6 4048491,... 2003
## 18533 MULTIPOLYGON (((335790.3 40... 1926
## 18537 POLYGON ((336900.3 4045657,... 1921
## 18539 POLYGON ((341760.3 4045357,... 1922
## 18578 POLYGON ((342600.3 4046527,... 1947
## 18587 POLYGON ((340290.3 4044637,... 1926
tm_shape(frap_fires_clipped)+
# add the fire polygons, using the year for fill color and transparency of 0.5
tm_polygons(col = 'year', alpha = 0.5)+
tm_shape(dx_aoi)+
tm_borders('blue')
If you have multiple rasters (or raster layers) and want to perform some per-pixel operation, it's pretty easy as long as the rasters line up:
# get the average vpd value from january and february 2020
vpd_avg =
(terraclimate[[1]]+terraclimate[[2]])/2
plot(vpd_avg)
# TRUE if vpd in a pixel is above 2, FALSE otherwise
vpd_above_2 = terraclimate[[1]] > 2
plot(vpd_above_2)
We can ‘mask’ one layer by another.
# make a version of the vpd_avg raster, but only showing cells where
# VPD in january is above 2
masked_avg =
raster::mask(x = vpd_avg,
mask = vpd_above_2,
maskvalue = FALSE)
plot(masked_avg)
We can downsample a numeric raster to a coarser resolution that's easier to work with:
aggregated_vpd =
raster::aggregate(x = terraclimate[[1]],
fact = 30)
plot(aggregated_vpd)
DEMs are a common type of raster, and there’s a built-in function, terrain(), for calculating all sorts of useful products from a DEM:
if (!file.exists(here::here('data', 'small','seki_dem.tif'))){
# download the DEM
seki_dem =
elevatr::get_elev_raster(locations =
as(dx_aoi %>% st_buffer(100), 'Spatial'),
prj = st_crs(dx_aoi)$proj4string,
z = 13)
# crop the DEM
seki_dem = raster::crop(seki_dem, dx_aoi %>% st_buffer(100))
# save the DEM
raster::writeRaster(seki_dem,
filename =
here::here('data',
'small',
'seki_dem.tif'))
} else {
  # just load the DEM if it's already downloaded
seki_dem =
raster::raster(here::here('data', 'small','seki_dem.tif'))
}
plot(seki_dem)
seki_terrain =
raster::terrain(seki_dem,
opt = c('slope', 'aspect', 'TPI', 'TRI', 'roughness','flowdir'))
plot(seki_terrain)
Raster algebra is useful for computing a southwestness index:
seki_aspect =
seki_dem %>%
raster::terrain(opt = 'aspect',
unit = 'degrees')
# make a new raster where the value of every pixel is abs(current_value-225)
seki_southwestness =
abs(seki_aspect-225)
# for every pixel where the value is greater than 180, get a new value
# (which is 360 - current value of cells where the value is > 180)
seki_southwestness[seki_southwestness>180] =
360-seki_southwestness[seki_southwestness>180]
# for every pixel, divide the value by 180
seki_southwestness = seki_southwestness/180
plot(seki_aspect)
plot(seki_southwestness)
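The same transform written as a plain function makes it easy to sanity-check the math on known aspects before applying it to a whole raster. This helper is just for illustration (not part of any package); note that as written, the index is 0 at due southwest and 1 at due northeast:

```r
# mirror the raster-algebra steps above as a vectorized function
sw = function(aspect_deg) {
  d = abs(aspect_deg - 225)      # angular distance from due southwest
  d[d > 180] = 360 - d[d > 180]  # fold distances > 180 back around the circle
  d / 180                        # scale to 0-1
}

sw(225)          # 0: due southwest
sw(45)           # 1: due northeast
sw(c(224, 226))  # symmetric around 225
```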
It’s also easy to do moving-window type analyses with the raster package:
seki_dem_avg =
raster::focal(x = seki_dem,
# w defines the size of the window in pixels; see
# help(focal)
w = matrix(nrow = 91, ncol = 91, data = rep(1, times = 8281)),
# give each cell the mean value of all its neighbors; you can
# supply other functions here, including custom functions
fun = mean,
na.rm = TRUE)
plot(seki_dem)
plot(seki_dem_avg)
One of the nice things about the sf package is how well it plays with tidyverse-style selection, mutation, summarization, etc.:
names(frap_fires)
## [1] "YEAR_" "STATE" "AGENCY" "UNIT_ID" "FIRE_NAME"
## [6] "INC_NUM" "ALARM_DATE" "CONT_DATE" "CAUSE" "COMMENTS"
## [11] "REPORT_AC" "GIS_ACRES" "C_METHOD" "OBJECTIVE" "FIRE_NUM"
## [16] "Shape_Length" "Shape_Area" "Shape"
# start with frap fires
big_frap_fires =
frap_fires %>%
# keep only fires greater than 100000 acres
filter(GIS_ACRES >= 100000)
plot(big_frap_fires['YEAR_'])
# get the total area in hectares of the big fires in each year
big_frap_fires_aggregated =
big_frap_fires %>%
mutate(area_ha = GIS_ACRES * 0.404) %>%
group_by(YEAR_) %>%
summarise(area_ha = sum(area_ha)) %>%
ungroup()
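Note that dplyr::summarise() on an sf object also unions the geometries within each group, so the result above carries one (multi)polygon per year. A self-contained sketch with toy points:

```r
library(sf)
library(dplyr)

# three toy points in two groups
pts = st_sf(grp = c('a', 'a', 'b'),
            geometry = st_sfc(st_point(c(0, 0)),
                              st_point(c(1, 1)),
                              st_point(c(2, 2))))
out = pts %>%
  group_by(grp) %>%
  summarise(n = n())
out  # two rows; group 'a' now carries a MULTIPOINT of its two points
```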
A lot of the time, weird errors about broken geometry can be fixed with one of two common tricks:
big_frap_fires_aggregated =
# start with potentially broken geometry
big_frap_fires_aggregated %>%
# buffer by 0
st_buffer(0) %>%
# st_make_valid()
st_make_valid()
Those two together take care of most busted geometries.
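You can also check for (and diagnose) broken geometry directly with st_is_valid() before reaching for the fixes. A self-contained sketch with a deliberately self-intersecting "bowtie" polygon:

```r
library(sf)

# a self-intersecting polygon: the ring crosses itself
bowtie = st_sfc(st_polygon(list(rbind(c(0, 0), c(1, 1), c(1, 0),
                                      c(0, 1), c(0, 0)))))

st_is_valid(bowtie)                 # FALSE
st_is_valid(bowtie, reason = TRUE)  # reports where the self-intersection is

fixed = st_make_valid(bowtie)
st_is_valid(fixed)                  # TRUE
```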
Suppose we want a raster of the fire perimeters in our SEKI AOI:
frap_fires_clipped_raster =
raster::rasterize(x = frap_fires_clipped,
field = 'year',
y = seki_dem)
plot(frap_fires_clipped_raster)
See also the fasterize package for a faster, more efficient alternative; raster::rasterize() can be very slow for large polygon datasets.
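A minimal sketch of the fasterize equivalent, assuming the fasterize package is installed (it isn't loaded elsewhere in this document). The interface mirrors raster::rasterize(), with the sf object first and a template raster defining the output grid:

```r
library(raster)
library(sf)
library(fasterize)  # assumed installed

# a toy square polygon with a numeric field to burn into the raster
poly = st_sf(val = 7,
             geometry = st_sfc(st_polygon(list(rbind(c(0, 0), c(10, 0),
                                                     c(10, 10), c(0, 10),
                                                     c(0, 0))))))
# a raster template defining the grid to rasterize onto
template = raster(nrows = 10, ncols = 10, xmn = 0, xmx = 10, ymn = 0, ymx = 10)

r = fasterize(poly, template, field = 'val')
unique(values(r))  # 7
```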
We can also convert rasters to polygons:
plot(seki_dem)
seki_above_1600 =
# frequently, we want to aggregate a raster before converting it
raster::aggregate(seki_dem, fact = 10) >= 1600
plot(seki_above_1600)
seki_above_1600_sf =
# convert to a SpatialPolygonsDataFrame
rasterToPolygons(seki_above_1600, dissolve = TRUE) %>%
# convert to an sf object
sf::st_as_sf()
## Loading required namespace: rgeos
seki_above_1600_sf
## Simple feature collection with 2 features and 1 field
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: 335753.4 ymin: 4045219 xmax: 342728.1 ymax: 4051427
## Projected CRS: NAD83 / UTM zone 11N
## layer geometry
## 1 0 MULTIPOLYGON (((340735.3 40...
## 2 1 MULTIPOLYGON (((342038.3 40...
tm_shape(seki_above_1600_sf)+
tm_polygons('layer')
seki_dem_df =
seki_dem %>%
as.data.frame(xy = TRUE)
head(seki_dem_df)
## x y seki_dem
## 1 335757.2 4051423 1548.856
## 2 335764.9 4051423 1540.908
## 3 335772.6 4051423 1532.685
## 4 335780.2 4051423 1524.572
## 5 335787.9 4051423 1517.304
## 6 335795.6 4051423 1510.806
frap_fires_df =
frap_fires %>%
as.data.frame() %>%
  select(-Shape) # usually this is select(-geometry)
head(frap_fires_df)
## YEAR_ STATE AGENCY UNIT_ID FIRE_NAME INC_NUM ALARM_DATE
## 1 2020 CA CDF NEU NELSON 00013212 2020-06-17 17:00:00
## 2 2020 CA CDF NEU AMORUSO 00011799 2020-05-31 17:00:00
## 3 2020 CA CDF NEU ATHENS 00018493 2020-08-09 17:00:00
## 4 2020 CA CDF NEU FLEMING 00007619 2020-03-30 17:00:00
## 5 2020 CA CDF NEU MELANESE 00008471 2020-04-13 17:00:00
## 6 2020 CA CDF NEU PFE 00014858 2020-07-04 17:00:00
## CONT_DATE CAUSE COMMENTS REPORT_AC GIS_ACRES C_METHOD OBJECTIVE
## 1 2020-06-22 17:00:00 11 110.0 109.60250 1 1
## 2 2020-06-03 17:00:00 2 670.0 685.58502 1 1
## 3 2020-02-29 16:00:00 14 26.0 27.30048 1 1
## 4 2020-03-31 17:00:00 9 13.0 12.93155 1 1
## 5 2020-04-18 17:00:00 18 10.3 10.31596 1 1
## 6 2020-07-04 17:00:00 14 36.0 36.70193 1 1
## FIRE_NUM Shape_Length Shape_Area
## 1 <NA> 3252.523 443544.67
## 2 <NA> 9653.760 2774464.03
## 3 <NA> 1649.643 110481.13
## 4 <NA> 1577.156 52332.11
## 5 <NA> 1035.788 41747.22
## 6 <NA> 2348.114 148527.45
# get the elevation at each DX plot
dx_plots$elev_m =
raster::extract(seki_dem,
dx_plots)
tm_shape(seki_dem)+
tm_raster(palette = 'viridis', style = 'cont')+
tm_shape(dx_plots)+
tm_dots(col = 'elev_m')+
tm_scale_bar()
# include df = TRUE so that we get a dataframe with polygon IDs
burned_elev =
raster::extract(seki_dem,
frap_fires_clipped,
df = TRUE)
# the ID column gives the polygon ID, the row number in the polygons table;
# there is one row per raster cell
head(burned_elev)
## ID seki_dem
## 1 1 2012.342
## 2 1 2015.766
## 3 1 2020.600
## 4 1 2024.312
## 5 1 2026.531
## 6 1 2028.331
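From here, ordinary dplyr summarizing gets you one row per polygon. A self-contained sketch using a toy stand-in for the burned_elev dataframe above:

```r
library(dplyr)

# toy stand-in for burned_elev: per-cell elevations tagged by polygon ID
burned_elev_toy = data.frame(ID = c(1, 1, 2, 2),
                             seki_dem = c(2000, 2010, 1500, 1520))
burned_summary =
  burned_elev_toy %>%
  group_by(ID) %>%
  summarise(mean_elev = mean(seki_dem),  # mean elevation within each polygon
            n_cells = n())               # number of raster cells per polygon
burned_summary
```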
Use st_intersection() (or less frequently st_difference() or st_sym_difference()) to get the intersection, difference, or symmetrical difference between sf objects. For example, if we want to flag the DX plots as above or below 1600m elevation:
# start with points and polygons
tm_shape(seki_above_1600_sf)+
tm_polygons('layer')+
tm_shape(dx_plots)+
tm_dots()
# convert the 0-1 layer to a clearer TRUE/FALSE flag
seki_above_1600_sf =
seki_above_1600_sf %>%
mutate(above_1600 = layer==1) %>%
select(-layer)
# use st_intersection() to extract the polygon values to the plot points
dx_plots =
dx_plots %>%
st_intersection(seki_above_1600_sf)
## Warning: attribute variables are assumed to be spatially constant throughout all
## geometries
# now dx_plots has a column for above_1600, which was extracted from the polygons
head(dx_plots)
## Simple feature collection with 6 features and 20 fields
## Geometry type: POINT
## Dimension: XY
## Bounding box: xmin: 336795.5 ymin: 4046051 xmax: 340315.2 ymax: 4049838
## Projected CRS: NAD83 / UTM zone 11N
## Site Plot Date Crew Latitude_D Longitude_ Datum_Form UTM_N
## 6 SEKI 13 2017-06-14 EK, DW, GV 36.57542 -118.7890 WGS84 4049264
## 19 SEKI 16 2017-06-22 LC, PM, DW 36.57546 -118.8241 WGS84 4049327
## 1 SEKI 10 2017-06-23 EK, KS, GV 36.54653 -118.7841 WGS84 4046051
## 2 SEKI 8 2017-06-22 EK, KS, GV 36.56500 -118.8017 WGS84 4048129
## 3 SEKI 26 2017-06-22 EK, KS, GV 36.54968 -118.8154 WGS84 4046453
## 4 SEKI 27 2017-07-02 EK, KS, GV 36.58047 -118.7974 WGS84 4049838
## UTM_E Elevation_ Slope Aspect RootRot InPlot NearPlot Distance RootRotNot
## 6 339933.5 1595.018 25 106 N NA NA NA NA
## 19 336795.5 1592.000 51 72 N NA NA NA NA
## 1 340315.2 1588.008 102 290 N NA NA NA NA
## 2 338772.8 1595.628 58 10 N NA NA NA NA
## 3 337516.4 1603.553 74 290 N NA NA NA NA
## 4 339196.0 1803.502 33 80 N NA NA NA NA
## OtherNotes elev_m above_1600 geometry
## 6 NA 1582.724 FALSE POINT (339933.5 4049264)
## 19 NA 1571.039 FALSE POINT (336795.5 4049327)
## 1 NA 1617.868 TRUE POINT (340315.2 4046051)
## 2 NA 1598.534 TRUE POINT (338772.8 4048129)
## 3 NA 1601.352 TRUE POINT (337516.4 4046453)
## 4 NA 1818.514 TRUE POINT (339196 4049838)
Dropping gridded sample points over some area of interest is easy:
gridded_plots =
dx_aoi %>%
# see the function help for options for a hexagonal grid, a grid based on
# corners rather than centers, or to return grid polygons instead of points
st_make_grid(
# alternatively, you can use the cellsize option to set the grid spacing
n = c(10, 10),
what = 'centers',
square = TRUE,
)
# by default this gives us a grid over the whole bbox, so use st_intersection
# to only keep points in the aoi
tm_shape(dx_aoi)+
tm_borders(col = 'black')+
tm_shape(gridded_plots)+
tm_dots('red')
gridded_plots =
gridded_plots %>%
# take an inner buffer of 100m to avoid putting plots right on the boundary
st_intersection(dx_aoi %>%
st_buffer(-100))
tm_shape(dx_aoi)+
tm_borders(col = 'black')+
tm_shape(gridded_plots)+
tm_dots('red')
We can also place plots randomly, and even stratify them by polygon:
tm_shape(frap_fires_clipped)+
tm_borders('red')
fire_plots =
frap_fires_clipped %>%
st_sample(
# or just 'size = 10' to sample 10 points distributed across all the polygons
size = rep(10, times = nrow(frap_fires_clipped)),
# can also be 'regular' or 'hexagonal'
type = 'random',
    # if FALSE, it'll give you only approximately the right number of points
exact = TRUE
) %>%
st_as_sf()
head(fire_plots)
## Simple feature collection with 6 features and 0 fields
## Geometry type: POINT
## Dimension: XY
## Bounding box: xmin: 342421.9 ymin: 4047263 xmax: 343138.4 ymax: 4048391
## Projected CRS: NAD83 / UTM zone 11N
## x
## 1 POINT (342653.6 4047992)
## 2 POINT (342421.9 4048391)
## 3 POINT (342794.8 4047263)
## 4 POINT (343138.4 4047582)
## 5 POINT (343018.3 4047932)
## 6 POINT (342484.3 4047330)
# this is a good example of how helpful the zoomable interactive map is for
# checking out spatial data
tm_shape(frap_fires_clipped)+
tm_borders('red')+
tm_shape(fire_plots)+
tm_dots('blue')
For static maps like you’d use in a presentation or paper, I prefer ggplot over tmap, because I find it more flexible for getting the details exactly the way you want them.
ggplot()+
# add raster data; note that you have to extract it to a dataframe with x
# and y coordinates first; make sure the CRS matches that of the vector data
# you want to use so that things line up
geom_raster(data = seki_dem_df,
aes(x = x, y = y, fill = seki_dem))+
scale_fill_viridis_c(option = 'C')+
# add vector data
geom_sf(data = dx_aoi,
fill = NA,
color = 'black',
lwd = 2)+
coord_sf()+
theme_minimal()+
# nice scale bar using ggspatial package
ggspatial::annotation_scale()
Basic interactive basemaps are pretty easy to do with tmap, but getting nice basemaps working for a static plot like you’d use in a paper or presentation is not easy. Here’s an example using the package basemaps:
basemaps::basemap_ggplot(ext = st_buffer(dx_aoi, 100),
map_service = 'esri',
map_type = 'world_imagery')+
geom_sf(data = st_transform(dx_aoi, 'EPSG:3857'),
fill = NA,
lwd = 2,
color = 'yellow')+
theme_bw()+
ggspatial::annotation_scale()+
theme(axis.title.x = element_blank(),
axis.title.y = element_blank())
## Loading basemap 'world_imagery' from map service 'esri'...
## Coordinate system already present. Adding new coordinate system, which will replace the existing one.
See the references section.